Prosper is America’s first peer-to-peer lending marketplace, with more than 2 million members and over $2,000,000,000 in funded loans.
## [1] 113937 81
The Prosper loan dataset contains 81 variables and almost 114,000 observations. However, not all of the variables are worth exploring. To cover both loans and borrowers, I focus on loan variables (loan amount, loan origination date, estimated return, and current loan status) and borrower variables (Prosper rating, Prosper score, listing category, borrower APR, income, employment status, occupation, homeownership, credit history, state, and total Prosper loans). The output below gives a glimpse of the selected variables.
## 'data.frame': 113937 obs. of 21 variables:
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : Factor w/ 1873 levels "2005-11-15 00:00:00",..: 426 1866 260 1535 1757 1821 1649 1666 1813 1813 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ProsperRating..Alpha. : Factor w/ 8 levels "","A","AA","B",..: 1 2 1 2 6 4 7 5 3 3 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory..numeric.: int 0 2 0 16 2 1 1 2 7 7 ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ IncomeRange : Factor w/ 8 levels "$0","$1-24,999",..: 4 5 7 4 3 3 4 4 4 4 ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ EmploymentStatus : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ Occupation : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
## $ IsBorrowerHomeowner : Factor w/ 2 levels "False","True": 2 1 1 2 2 2 1 1 2 2 ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ BorrowerState : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
##  LoanOriginalAmount  LoanOriginationDate           EstimatedReturn
##  Min.   : 1000       2014-01-22 00:00:00:   491    Min.   :-0.183
##  1st Qu.: 4000       2013-11-13 00:00:00:   490    1st Qu.: 0.074
##  Median : 6500       2014-02-19 00:00:00:   439    Median : 0.092
##  Mean   : 8337       2013-10-16 00:00:00:   434    Mean   : 0.096
##  3rd Qu.:12000       2014-01-28 00:00:00:   339    3rd Qu.: 0.117
##  Max.   :35000       2013-09-24 00:00:00:   316    Max.   : 0.284
##                      (Other)            :111428    NA's   :29084
##  LoanStatus                    ProsperRating..Alpha.  ProsperScore
##  Current              :56576          :29084          Min.   : 1.00
##  Completed            :38074   C      :18345          1st Qu.: 4.00
##  Chargedoff           :11992   B      :15581          Median : 6.00
##  Defaulted            : 5018   A      :14551          Mean   : 5.95
##  Past Due (1-15 days) :  806   D      :14274          3rd Qu.: 8.00
##  Past Due (31-60 days):  363   E      : 9795          Max.   :11.00
##  (Other)              : 1108   (Other):12307          NA's   :29084
##  ListingCategory..numeric.  BorrowerAPR        IncomeRange
##  Min.   : 0.000             Min.   :0.00653    $25,000-49,999:32192
##  1st Qu.: 1.000             1st Qu.:0.15629    $50,000-74,999:31050
##  Median : 1.000             Median :0.20976    $100,000+     :17337
##  Mean   : 2.774             Mean   :0.21883    $75,000-99,999:16916
##  3rd Qu.: 3.000             3rd Qu.:0.28381    Not displayed : 7741
##  Max.   :20.000             Max.   :0.51229    $1-24,999     : 7274
##                             NA's   :25         (Other)       : 1427
##  StatedMonthlyIncome  DebtToIncomeRatio  EmploymentStatus
##  Min.   :      0      Min.   : 0.000     Employed     :67322
##  1st Qu.:   3200      1st Qu.: 0.140     Full-time    :26355
##  Median :   4667      Median : 0.220     Self-employed: 6134
##  Mean   :   5608      Mean   : 0.276     Not available: 5347
##  3rd Qu.:   6825      3rd Qu.: 0.320     Other        : 3806
##  Max.   :1750003      Max.   :10.010                  : 2255
##                       NA's   :8554       (Other)      : 2718
##  EmploymentStatusDuration  Occupation
##  Min.   :  0.00            Other                   :28617
##  1st Qu.: 26.00            Professional            :13628
##  Median : 67.00            Computer Programmer     : 4478
##  Mean   : 96.07            Executive               : 4311
##  3rd Qu.:137.00            Teacher                 : 3759
##  Max.   :755.00            Administrative Assistant: 3688
##  NA's   :7625              (Other)                 :55456
##  IsBorrowerHomeowner  CreditScoreRangeLower  CreditScoreRangeUpper
##  False:56459          Min.   :  0.0          Min.   : 19.0
##  True :57478          1st Qu.:660.0          1st Qu.:679.0
##                       Median :680.0          Median :699.0
##                       Mean   :685.6          Mean   :704.6
##                       3rd Qu.:720.0          3rd Qu.:739.0
##                       Max.   :880.0          Max.   :899.0
##                       NA's   :591            NA's   :591
##  MonthlyLoanPayment  BorrowerState   TotalProsperLoans  LenderYield
##  Min.   :   0.0      CA     :14717   Min.   :0.00       Min.   :-0.0100
##  1st Qu.: 131.6      TX     : 6842   1st Qu.:1.00       1st Qu.: 0.1242
##  Median : 217.7      NY     : 6729   Median :1.00       Median : 0.1730
##  Mean   : 272.5      FL     : 6720   Mean   :1.42       Mean   : 0.1827
##  3rd Qu.: 371.6      IL     : 5921   3rd Qu.:2.00       3rd Qu.: 0.2400
##  Max.   :2251.5             : 5515   Max.   :8.00       Max.   : 0.4925
##                      (Other):67493   NA's   :91852
The univariate plots are divided into two sections: one for loans and one for borrowers.
First, we plot a histogram of the loan original amount. The loan amounts are skewed to the right, with the median (6,500) below the mean (8,337), so I rescale the x axis.
After rescaling, the loan original amount looks roughly normally distributed. The histogram peaks at a count of about 15,000 where the loan original amount is 1,500, which means that about 15,000 loans have an amount of $1,500.
Next, we plot a histogram of the loan origination date. The data is recorded by date; to make the trend easier to see, I regrouped it by year.
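The regrouping by year can be sketched as follows. This is a minimal base R illustration, not the original code; the vector `origination` is a made-up sample in the same format as `LoanOriginationDate`, and the names `origination_date`, `origination_year`, and `year_counts` are hypothetical.

```r
# Hypothetical sample in the same "YYYY-MM-DD HH:MM:SS" format as LoanOriginationDate
origination <- factor(c("2005-11-15 00:00:00", "2009-03-02 00:00:00",
                        "2013-11-13 00:00:00", "2014-01-22 00:00:00",
                        "2014-02-19 00:00:00"))

# Convert the factor to character first, then parse the date part and keep the year
origination_date <- as.Date(substr(as.character(origination), 1, 10))
origination_year <- as.integer(format(origination_date, "%Y"))

# Count loans per year
year_counts <- table(origination_year)
print(year_counts)
```

Converting the factor with `as.character()` before parsing matters: calling `as.integer()` on a factor directly would return the internal level codes rather than the dates.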
There is a sharp dip in 2009, and the count does not recover to its previous level until 2011. This coincides with the peak and aftermath of the 2008 global financial crisis. Because the data ends in March 2014, the histogram also shows a big drop in 2014.
Next, we look at Loan status count:
Most loans are completed, charged off, current, or defaulted, and there is a series of past-due categories. Later, I regroup the statuses into 4 groups to make them easier to analyze. Next, we look at the Prosper risk rating.
The ratings follow a roughly bell-shaped distribution, peaking at C with about 18,000 loans (18,345 in the summary above).
A higher rating corresponds to a higher score, so we look at the Prosper score next:
It also follows a roughly bell-shaped distribution. Comparing the rating and the score, Prosper loan risk appears to be approximately normally distributed.
Another interesting question is why people borrow from Prosper. We plot the histogram of the listing category below. Because the data provides only a numeric code, I rename the categories using the Variable Definitions file.
## # A tibble: 5 × 1
## ListingCategory
## <fctr>
## 1 Debt Consolidation
## 2 Not Available
## 3 Other
## 4 Home Improvement
## 5 Business
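The renaming of the numeric listing codes can be sketched with a named lookup vector. This is a minimal illustration rather than the original code: the code-to-label pairs below are a partial list written from memory and should be checked against the Variable Definitions file, and `listing_code` is a made-up sample.

```r
# Partial code-to-label lookup (assumed; verify against the Variable Definitions file)
category_labels <- c("0" = "Not Available", "1" = "Debt Consolidation",
                     "2" = "Home Improvement", "3" = "Business",
                     "4" = "Personal Loan", "5" = "Student Use",
                     "6" = "Auto", "7" = "Other")

# Hypothetical sample of ListingCategory..numeric. values
listing_code <- c(0, 2, 0, 1, 1, 2, 7, 7, 1)

# Look up each code by name; codes missing from the lookup become NA
listing_category <- unname(category_labels[as.character(listing_code)])

# Most frequent categories first
category_counts <- sort(table(listing_category), decreasing = TRUE)
print(category_counts)
```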
The top 5 loan reasons are Debt Consolidation, Not Available, Other, Home Improvement, and Business. Since the first 3 do not tell us the actual reason, we omit them and analyze the other, more specific reasons.
The top 5 specific reasons are now Home Improvement, Business, Auto, Personal Loan, and Household Expenses. Most of the remaining categories have counts between 100 and 1,000.
Next, we analyze BorrowerAPR, the borrower's Annual Percentage Rate for the loan. It indicates the yearly cost of the loan to the borrower, and depends on the market rate and the urgency of the borrower.
The distribution is approximately normal. The mean BorrowerAPR is about 21.9% (0.21883 in the summary above), which is quite high. Since debt consolidation is the most common listing category, many borrowers may be combining several existing debts to avoid even higher rates.
Then we analyze income range:
The counts are roughly bell-shaped across the income ranges. Most borrowers have an income below 75,000, and the most common range is 25,000 to 49,999.
Then we analyze debt to income ratio:
If we remove the outliers, the distribution is unimodal but skewed to the right. Most ratios are below 0.5, and the distribution peaks near 0.2.
Next, we look at employment status:
The majority of borrowers are employed.
Then we look at how long they have kept their employment status.
The distribution is skewed to the right. Most borrowers have kept their status for less than 20 years, and the distribution peaks near 1 year. This suggests that people with a less stable employment status might borrow more.
Then we analyze the borrowers’ occupations:
Even though I have scaled the y axis, the counts still vary widely. Setting aside the ambiguous “Other” and “Professional”, the summary above shows Computer Programmer, Executive, and Teacher as the top occupations among borrowers.
Then we look at how many of them are homeowners.
Homeowners and non-homeowners are almost evenly split, so this variable alone tells us little.
Then we analyze the lower bound of the credit score range.
After removing the outliers, the majority of borrowers have a lower credit score bound between 600 and 800. The average is below 700.
We can also analyze the upper bound of the credit score range:
It has almost the same distribution, shifted 19 points higher than the lower bound. In the following analysis, I use the lower bound to represent the credit score.
Then we analyze the borrowers’ states. First, we convert the state abbreviations to full names.
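One way to do this conversion is with the `state.abb` and `state.name` vectors that ship with base R. The sketch below is an assumption about the approach, not the original code; `borrower_state` is a made-up sample, and `full_name` and `region` are hypothetical names.

```r
# Hypothetical sample of BorrowerState values, including an empty string for missing
borrower_state <- c("CA", "TX", "NY", "FL", "IL", "DC", "")

# state.abb and state.name come with base R and cover the 50 states
full_name <- state.name[match(borrower_state, state.abb)]

# The District of Columbia is not in state.abb, so it is handled separately
full_name[borrower_state == "DC"] <- "District of Columbia"

# map_data("state") uses lower-case region names, so lower-case before joining
region <- tolower(full_name)
print(region)
```

Values that match no state, such as the empty string, stay `NA` and drop out of the map.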
Then we find the top 5 borrowers’ states and plot the state map:
## # A tibble: 5 × 1
## BorrowerState
## <fctr>
## 1 CA
## 2 TX
## 3 NY
## 4 FL
## 5 IL
Most states have fewer than 5,000 borrowers. The top 5 borrower states are CA, TX, NY, FL, and IL.
Next, we analyze the borrowers’ total Prosper loans:
The number of borrowers decreases as the number of loans increases. Most borrowers have fewer than 3 loans.
The data contains 113,937 rows and 81 variables. The variables I analyzed are of type int, num, and factor.
The loan original amount and BorrowerAPR are the main features of interest. The former reflects how much money borrowers want and are able to borrow; the latter determines how much interest borrowers pay on the debt. Prosper would like to lend as much as possible to borrowers who can pay it back, and borrowers want the lowest APR for the same amount of debt.
I expect all the variables I picked to help explain how Prosper decides the loan original amount and at which APR borrowers are willing to borrow.
Yes. The loan origination date was precise to the day; to make the trend easier to see, I regrouped it by year. ListingCategory provided only a numeric code, so I renamed the categories using the Variable Definitions file. The employment status duration was recorded in months, so I converted it to years. I also converted the state abbreviations to full names.
The histogram of loan origination year showed a big dip around 2009, which is reasonable given the global financial crisis. For most factor variables, I reordered the levels to make the plots easier to read. For most numeric variables, I removed the outliers to show the main part of the distribution.
To get a glimpse, I start with ggpairs to explore the relationships between the numeric variables.
## NULL
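A correlation matrix is another way to screen the numeric pairs before plotting them. The sketch below uses synthetic data, not the Prosper data, and hypothetical column names; it lists each pair of columns whose absolute Pearson correlation exceeds a threshold.

```r
set.seed(1)
n <- 200
# Synthetic stand-ins for three numeric columns (not the Prosper data)
amount  <- runif(n, 1000, 35000)
payment <- amount * 0.03 + rnorm(n, sd = 50)   # strongly tied to amount
noise   <- rnorm(n)                            # unrelated column
loans <- data.frame(amount, payment, noise)

# Pairwise Pearson correlations, ignoring missing values pair by pair
cor_matrix <- cor(loans, use = "pairwise.complete.obs")

# Keep each pair once (upper triangle) and filter by absolute correlation
idx <- which(upper.tri(cor_matrix) & abs(cor_matrix) > 0.4, arr.ind = TRUE)
strong_pairs <- data.frame(
  var1 = rownames(cor_matrix)[idx[, "row"]],
  var2 = colnames(cor_matrix)[idx[, "col"]],
  r    = cor_matrix[idx]
)
print(strong_pairs)
```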
We pick the pairs with an absolute correlation higher than 0.4. First, we analyze the relationship between loan original amount and monthly loan payment:
The monthly payment increases as the loan original amount increases.
Then we analyze the estimated return against the borrower APR:
The borrower APR increases as the estimated return increases.
Then we analyze the Prosper score against the borrower APR:
The borrower APR decreases as the Prosper score increases.
Then we analyze the credit score against the borrower APR:
The credit score decreases as the borrower APR increases.
From the analysis above, the key variables are the borrower APR and the loan original amount, to which the other variables are related.
Then I analyze how the other factors relate to them.
To simplify the analysis, I regroup these factors based on practical experience:
For loan status, we define 4 groups: In process (Current and Final payment in progress); Past Due (all Past Due levels); Completed; and Bad debt (Defaulted and Chargedoff). However, since a “Completed” loan could previously have been in any of the other groups, I only analyze the other three.
For loan origination year, we split the data into pre-2009 and post-2009, because of the financial crisis.
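The regrouping can be sketched in base R as below. This is an illustration under assumptions: the sample values are made up, the level name `FinalPaymentInProgress` is assumed rather than taken from the output above, and whether 2009 itself counts as pre or post is a choice the text does not specify.

```r
# Hypothetical sample of LoanStatus values and origination years
loan_status <- c("Current", "Completed", "Chargedoff", "Defaulted",
                 "Past Due (1-15 days)", "FinalPaymentInProgress", "Cancelled")
origination_year <- c(2013, 2008, 2007, 2008, 2014, 2013, 2006)

# Collapse the original levels into the four groups; anything else becomes NA
status <- ifelse(loan_status %in% c("Current", "FinalPaymentInProgress"), "In process",
          ifelse(grepl("^Past Due", loan_status), "Past Due",
          ifelse(loan_status %in% c("Chargedoff", "Defaulted"), "Bad debt",
          ifelse(loan_status == "Completed", "Completed", NA))))

# Assumption: 2009 itself is counted as post-2009
year_group <- ifelse(origination_year < 2009, "Pre-2009", "Post-2009")

print(table(status, year_group, useNA = "ifany"))
```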
Now we make a pairwise comparison between loan original amount and the other factors:
All of the factors above affect the loan original amount.
First, we analyze the loan original amount and loan status:
There is a margin between the different statuses. Bad debt makes up a larger share at lower loan original amounts, while past-due and in-process loans make up a larger share at higher loan original amounts. This suggests that borrowers with a higher loan original amount are more likely to repay the debt.
Next, we look at the summary statistics:
Bad debt has a lower mean and median loan original amount than past due and in process.
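A grouped summary of this kind can be computed with `aggregate()`. The data frame below is a small made-up sample, not the Prosper data, and only illustrates the mechanics.

```r
# Hypothetical sample: loan amount with the regrouped status
loans <- data.frame(
  amount = c(4000, 15000, 10000, 2500, 3000, 12000, 8000, 3500),
  status = c("In process", "In process", "Past Due", "Bad debt",
             "Bad debt", "Past Due", "In process", "Bad debt")
)

# Mean and median of the loan amount within each status group
amount_summary <- aggregate(amount ~ status, data = loans,
                            FUN = function(x) c(mean = mean(x), median = median(x)))
print(amount_summary)
```

Because the summary function returns two values, `amount_summary$amount` is a two-column matrix with columns `mean` and `median`.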
I then examine the difference after adding occupation:
The linear model fits well for both statuses. I analyze this in detail in Final Plot 2.
From here, we might also be interested in why borrowers in the occupations with the highest average income and the least bad debt take out loans.
I give a detailed analysis in Final Plot 3.
Next, I analyze the loan original amount and Prosper rating:
Since the AA data peaks at 15,000 while the other ratings show a trend with the loan original amount, we exclude AA and analyze again:
There are margins between the different Prosper ratings. Higher-rated loans make up a larger share at higher loan original amounts, and lower-rated loans make up a larger share at lower loan original amounts. This suggests that the higher your Prosper rating, the more you can borrow at Prosper. The AA rating, however, does not depend on how much you borrow.
Next, I look at the summary statistics:
There is hardly any difference in the mean and median loan original amount among Prosper ratings B, A, and AA, or between HR and E. Overall, however, the mean and median loan original amount increase as the rating improves.
Then I analyze the loan original amount against the loan origination year.
There is a margin between pre-2009 and post-2009. Pre-2009 loans make up a larger share at lower loan original amounts, and post-2009 loans make up a larger share at higher loan original amounts. This shows that borrowers took out larger loans at Prosper after 2009.
Next, I look at the summary statistics:
Pre-2009 loans have a lower mean and median loan original amount than post-2009 loans.
Since we have already analyzed the relationship between BorrowerAPR and Prosper rating, we now focus on the other two factors.
## [1] "BorrowerAPR" "status" "year"
Both factors above affect BorrowerAPR.
There is a quadratic relationship between borrower APR and the In process and Past Due loan statuses. Where the borrower APR is closer to 0.2, the share of in-process and past-due debt is higher; where it is closer to 0 or 0.4, the share of bad debt is higher.
Next, I look at the summary statistics:
In process loans have a lower mean and median BorrowerAPR than Past Due and Bad debt loans.
Then we analyze BorrowerAPR against the loan origination year.
There is a quadratic relationship between borrower APR and the share of post-2009 debt. Where the borrower APR is closer to 0.25, the share of post-2009 debt is higher; where it is closer to 0 or 0.5, the share of pre-2009 debt is higher.
Next, I look at the summary statistics:
Pre-2009 loans have a lower mean and median BorrowerAPR than post-2009 loans.
After all the pairwise comparisons, we can also look at the average income in each state:
Most states have an average monthly income below 6,000. The richest states are mostly located on the coast. Connecticut is the only state with an average monthly income above 7,000.
Most states have a homeowner percentage below 60%. The states with the highest homeowner percentage are mostly located far from the coast. Wyoming is the only state with a homeowner percentage above 70%.
Next, we look at the average loan amount by state. Most states have an average loan amount above 7,000. The District of Columbia is the only region with an average loan amount above 10,000.
When the loan original amount increases, the monthly loan payment increases, and BorrowerAPR also increases. When the loan original amount is lower, there is more bad debt, a lower Prosper rating, and more pre-2009 debt. When BorrowerAPR increases, the estimated return increases, while the Prosper score and the credit score decrease. When BorrowerAPR is close to 0.25, there is more debt in process and more post-2009 debt.
The state maps of average income and homeowner percentage were the most interesting. Borrowers in coastal states tend to have a higher average income but a lower homeowner percentage, while borrowers in states far from the sea tend to have a lower average income but a higher homeowner percentage.
The strongest relationship was between monthly payment and loan original amount, with a correlation of 0.93.
First, I analyze the relationship between loan original amount, monthly loan payment, and loan status. Since I have already analyzed the relationship between loan original amount and monthly loan payment, here I check whether adding the loan status makes any difference. To compare different models, we can use the AIC (Akaike information criterion) to evaluate their performance; the better model has the lower AIC.
The model that considers only LoanOriginalAmount and MonthlyLoanPayment has AIC:
## [1] 846371.9
The new model, which also considers LoanStatus, has AIC:
## [1] 844085.8
The new model’s AIC is lower than the original’s, which means that adding loan status improves the model. Next, I plot the new graph to show the difference.
The Current and Final payment in progress loans are in the lower region, and the Past Due and Bad debt loans are in the upper region.
Then I analyze the relationship between loan original amount, monthly loan payment, and ProsperRating. Since I have already analyzed the relationship between loan original amount and monthly loan payment, here I check whether adding ProsperRating makes any difference.
The model that considers only LoanOriginalAmount and MonthlyLoanPayment has AIC:
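An AIC comparison of this kind can be sketched with `lm()` and `AIC()`. The example below runs on synthetic data, not the Prosper data, and the model formulas are an assumption about how such models might be specified, not the original code.

```r
set.seed(42)
n <- 300
# Synthetic data (not the Prosper data): payment depends on amount and on status
status <- factor(sample(c("In process", "Past Due", "Bad debt"), n, replace = TRUE))
amount <- runif(n, 1000, 35000)
rate_by_status <- c("Bad debt" = 0.036, "In process" = 0.030, "Past Due" = 0.034)
payment <- amount * unname(rate_by_status[as.character(status)]) + rnorm(n, sd = 20)
loans <- data.frame(amount, payment, status)

# Baseline: payment explained by amount only
fit_base <- lm(payment ~ amount, data = loans)
# Extended: a separate intercept and slope for each status group
fit_status <- lm(payment ~ amount * status, data = loans)

# Lower AIC indicates a better trade-off between fit and model complexity
aic_values <- c(base = AIC(fit_base), with_status = AIC(fit_status))
print(aic_values)
```

AIC values are only comparable between models fitted to the same rows, so any filtering for missing values should happen before both models are fitted.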
## [1] 965172.5
The new model, which also considers Prosper rating, has AIC:
## [1] 961767.7
The new model’s AIC is lower than the original’s, which means that adding ProsperRating improves the model.
Next, I plot the new graph to show the difference.
The monthly payment is determined by the loan original amount, the interest rate, and the payment duration. The interest rate is influenced by the market, so there are 3 regions in the plot, indicating 3 different market rates. In each region, a better loan status has a lower slope between loan original amount and monthly payment, which indicates a longer duration for loans with a better status. I then analyze the relationship between duration and APR for the different loan statuses.
The loan status differs across APR and N (duration). I give the details in Final Plot 1.
Then I analyze the relationship between loan original amount, monthly loan payment, and loan origination year. Since I have already analyzed the relationship between loan original amount and monthly loan payment, here I check whether adding the loan origination year makes any difference.
The model that considers only LoanOriginalAmount and MonthlyLoanPayment has AIC:
## [1] 1276470
The new model, which also considers loan origination year, has AIC:
## [1] 1276138
The new model’s AIC is lower than the original’s, which means that adding the loan origination year improves the model. However, the difference between the two models is small. Next, I plot the new graph to show the difference.
Post-2009 loans are in the lower region and pre-2009 loans are in the upper region. However, the difference is small and the two lines are close.
Then I analyze the relationship between BorrowerAPR, CreditScoreRangeLower, and LoanStatus. Since I have already analyzed the relationship between BorrowerAPR and CreditScoreRangeLower, here I check whether adding the loan status makes any difference.
The model that considers only BorrowerAPR and CreditScoreRangeLower has AIC:
## [1] 809264
The new model, which also considers LoanStatus, has AIC:
## [1] 799817.3
The new model’s AIC is lower than the original’s, which means that adding loan status improves the model. Next, I plot the new graph to show the difference.
The In process and Past Due loans are in the upper region, and the Bad debt loans are in the lower region.
Then I analyze the relationship between BorrowerAPR, CreditScoreRangeLower, and the Prosper score. Since I have already analyzed the relationship between BorrowerAPR and CreditScoreRangeLower, here I check whether adding the Prosper score makes any difference.
The model that considers only BorrowerAPR and CreditScoreRangeLower has AIC:
## [1] -215472.4
The new model, which also considers Prosper rating, has AIC:
## [1] -413812.7
The new model’s AIC is much lower than the original’s, which means that adding the Prosper score improves the model. Next, I plot the new graph to show the difference.
A better loan status has a lower BorrowerAPR at the same credit score.
Then I analyze the relationship between BorrowerAPR, CreditScoreRangeLower, loan status, and the Prosper score. Since I have already analyzed the relationship between BorrowerAPR and CreditScoreRangeLower, here I check whether adding both loan status and the Prosper score makes any difference.
From In process to Bad debt, the average Prosper rating gets lower, the BorrowerAPR gets higher, and the credit score gets lower.
Loan status and Prosper rating both differentiate the relationship between monthly payment and loan original amount, and the relationship between credit score and borrower APR. In the last graph, loan status and Prosper rating reinforce each other with respect to BorrowerAPR: a higher BorrowerAPR indicates a borrower who is less likely to pay back, so the share of bad debt is higher, and borrowers who are more likely to have bad debt are also more likely to have a lower Prosper rating.
The most interesting interaction is that of Prosper rating with BorrowerAPR and credit score: there are distinct layers for the different Prosper ratings. The most surprising interaction is that of loan origination year with loan original amount and monthly payment. It seems common sense that there should be a difference between post-2009 and pre-2009, but the difference turned out to be small.
I created general linear models to check whether some of the factors could separate the data, and they could. The strength of a model is that it can be computed and tested; its weakness is that it is not as straightforward to read as a graph.
The first graph digs into Prosper rating, loan duration, and BorrowerAPR. In the previous graph, we analyzed the monthly payment and loan original amount. However, since interest rates may differ, the slope alone cannot distinguish between Prosper ratings. So I calculate the duration “N” from the loan original amount, monthly payment, and interest rate. Each value of N gives the length of the repayment period in years. Then I bin “N” into “Nrank”. Combining Nrank and APRrank, we can easily distinguish In process debt from the others. Most of the In process debt has the lowest APR and the longest duration. Most of the Past Due and Bad debt has the highest APR and the shortest duration.
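One way to derive such a duration is to invert the standard fixed-payment (annuity) loan formula. The sketch below is an assumption about the calculation, not the original code: the function name `loan_duration_years` is hypothetical, and which rate column was used as the interest rate is not stated in the text.

```r
# Fixed-payment loan: payment = principal * r / (1 - (1 + r)^(-n)), r = monthly rate
# Solving for the number of payments: n = -log(1 - principal * r / payment) / log(1 + r)
loan_duration_years <- function(principal, payment, annual_rate) {
  r <- annual_rate / 12                 # monthly interest rate
  ratio <- principal * r / payment      # first month's interest as a share of the payment
  # The formula is only defined for 0 < ratio < 1 (payment must exceed monthly interest)
  ratio[!is.finite(ratio) | ratio <= 0 | ratio >= 1] <- NA
  n_months <- -log(1 - ratio) / log(1 + r)
  n_months / 12
}

# Round-trip check: build the payment for a 36-month loan, then recover the duration
r <- 0.12 / 12
payment_36 <- 10000 * r / (1 - (1 + r)^(-36))
duration <- loan_duration_years(10000, payment_36, 0.12)
print(duration)
```

Rows with a zero payment or a zero rate return `NA` instead of an error, which keeps the function usable on a whole column.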
To find out who is more likely to repay, by average income and occupation, I plot the second graph. After rescaling the average income axis, the relationship between average income and the share of each debt status is approximately linear. The higher the average income, the higher the share of repaid debt and the lower the share of bad debt. The 5 occupations with the least bad debt are Judge, Dentist, Psychologist, Pilot, and Attorney. The 5 occupations with the most bad debt are all college students.
The last graph shows why borrowers in the 5 occupations with the least bad debt take out loans. Since we do not know the exact reason people borrow money under “Other”, “Not Available”, or “Debt Consolidation”, we remove those categories from the table, even though they account for over 50 percent of the debt. For the 5 occupations with the least bad debt, the loans mostly go to business and home improvement.
The Prosper loan dataset contains 81 variables for over 110,000 rows of data. I started from two sides, one describing the loan and the other the borrower, and picked 21 variables. I explored their distributions, found interesting questions, and arrived at the core variable of each side: loan original amount and BorrowerAPR. I used correlation and covariance to analyze the relationship between the other variables and these core variables in the bivariate analysis section. Finally, I used AIC to check whether adding another variable improves the model in the multivariate analysis.
During the process, I used dplyr to clean, subset, group, and summarize data; ggplot2 to plot data; maps to plot state maps or GIS information; corrplot to show the correlation coefficients; and RColorBrewer to show gradients between different levels of data.
There was a clear trend between the loan original amount and monthly loan payment, estimated return, Prosper rating, loan origination year, and occupation. There was another trend between BorrowerAPR and Prosper rating, borrower credit range, loan status, loan origination year, and occupation. I struggled to understand the relationship among loan original amount, monthly loan payment, and loan status, so I looked up the financial definition and generated a new variable called Duration, which helps us distinguish who might have bad debt.
So far, the analysis focuses on data exploration without building a predictive or classification model. For further analysis, I can use a general linear model to combine both numeric and factor variables for prediction and classification. Prosper could benefit from such a model to find out who is likely to have bad debt, how large the original amount should be for different borrowers, and how much it would earn from them.
How to order the (factor) variables in ggplot2 https://kohske.wordpress.com/2010/12/29/faq-how-to-order-the-factor-variables-in-ggplot2/
Prosper Loan Data - Variable Definitions https://docs.google.com/spreadsheets/d/1gDyi_L4UvIrLTEC6Wri5nbaMmkGmLQBk-Yx3z0XDEtI/edit#gid=0
US State Maps using map_data() https://www.r-bloggers.com/us-state-maps-using-map_data/
Split time series data http://stackoverflow.com/questions/13649019/with-r-split-time-series-data-into-time-intervals-say-an-hour-and-then-plot-t
Use corcolor to plot correlation http://www.sthda.com/english/wiki/correlation-matrix-a-quick-start-guide-to-analyze-format-and-visualize-a-correlation-matrix-using-r-software
colors in R http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 14393)
##
## locale:
## [1] LC_COLLATE=English_United States.1252
## [2] LC_CTYPE=English_United States.1252
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.1252
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] scales_0.4.1 RColorBrewer_1.1-2 corrplot_0.77
## [4] GGally_1.3.0 dplyr_0.5.0 maps_3.1.1
## [7] ggplot2_2.2.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.6 plyr_1.8.4 tools_3.3.1
## [4] rpart_4.1-10 digest_0.6.10 base64_2.0
## [7] htmlTable_1.7 evaluate_0.10 tibble_1.2
## [10] gtable_0.2.0 lattice_0.20-34 Matrix_1.2-6
## [13] DBI_0.5-1 yaml_2.1.14 gridExtra_2.2.1
## [16] cluster_2.0.4 stringr_1.0.0 knitr_1.15.1
## [19] nnet_7.3-12 rprojroot_1.1 grid_3.3.1
## [22] data.table_1.10.0 reshape_0.8.6 R6_2.2.0
## [25] survival_2.40-1 foreign_0.8-66 rmarkdown_1.3
## [28] latticeExtra_0.6-28 Formula_1.2-1 magrittr_1.5
## [31] backports_1.0.4 Hmisc_4.0-1 htmltools_0.3.5
## [34] splines_3.3.1 assertthat_0.1 colorspace_1.2-6
## [37] labeling_0.3 stringi_1.1.1 acepack_1.4.1
## [40] lazyeval_0.2.0 openssl_0.9.5 munsell_0.4.3